On Bidirectional English-Arabic Search
Abstract
In Cross-Language Information Retrieval (CLIR), queries in one language retrieve relevant documents in other languages. Machine-Readable Dictionaries (MRDs) and Machine Translation (MT) systems are important resources for query translation in CLIR. We investigate the use of MT systems and MRDs for Arabic-English and English-Arabic CLIR. The key problem is the translation ambiguity associated with these resources. We present three methods of query translation using a bilingual dictionary for Arabic-English CLIR. First, we present the Every-Match (EM) method, which yields ambiguous translations because it adds many extraneous terms to the original query. To disambiguate query translation, we present the First-Match (FM) method, which takes the first match in the dictionary as the candidate term. Finally, we present the Two-Phase (TP) method. We show that the Two-Phase method achieves good retrieval effectiveness for Arabic-English CLIR without complex resources. We also empirically evaluate the effectiveness of the Arabic-English MT approach using short, medium, and long queries from TREC7 and TREC9 topics and collections, and investigate the effect of query length on the quality of MT-based CLIR. English-Arabic CLIR is evaluated using an MRD and English-Arabic MT. Post-translation query expansion is used to de-emphasize the extraneous terms introduced by the MRD and MT in English-Arabic CLIR.
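The difference between the Every-Match and First-Match methods can be illustrated with a minimal sketch. This is not the authors' implementation: the toy dictionary, the transliterated Arabic keys, the function names, and the choice to pass out-of-dictionary terms through untranslated are all assumptions made for illustration. The Two-Phase method is not sketched because the abstract does not describe how it works.

```python
from typing import Dict, List

# Hypothetical toy Arabic-English dictionary (transliterated keys).
# Each source term maps to an ordered list of candidate translations.
TOY_DICTIONARY: Dict[str, List[str]] = {
    "ayn": ["eye", "spring", "spy"],  # polysemous term: several candidates
    "maa": ["water"],
}


def translate_every_match(query: List[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """Every-Match: keep every dictionary translation of every query term."""
    translated: List[str] = []
    for term in query:
        # Assumption: a term missing from the dictionary is kept as-is.
        translated.extend(dictionary.get(term) or [term])
    return translated


def translate_first_match(query: List[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """First-Match: keep only the first dictionary translation of each term."""
    translated: List[str] = []
    for term in query:
        candidates = dictionary.get(term)
        # Assumption: a term missing from the dictionary is kept as-is.
        translated.append(candidates[0] if candidates else term)
    return translated


if __name__ == "__main__":
    query = ["ayn", "maa"]
    print(translate_every_match(query, TOY_DICTIONARY))
    print(translate_first_match(query, TOY_DICTIONARY))
```

In this toy case Every-Match expands the two-term query to four terms, two of which ("spring" or "eye", and "spy") are likely extraneous for any single intended sense, while First-Match keeps the query at two terms at the cost of trusting the dictionary's ordering.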
Similar Resources
Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...
QArabPro: A Rule Based Question Answering System for Reading Comprehension Tests in Arabic
Problem statement: Extensive research efforts in the area of Natural Language Processing (NLP) were focused on developing reading comprehension Question Answering (QA) systems for Latin based languages such as English, French and German. Approach: However, little effort was directed towards the development of such systems for bidirectional languages such as Arabic, Urdu and Farsi. In general, ...
The Reality of Arabic Fiction Translation into English: A Sociological Approach
English translations of texts associated with Arabic fiction remain largely unexplored from a sociological perspective. Drawing on Pierre Bourdieu’s sociology, this paper aims to examine the genesis of Arabic fiction translation into English as a socially situated activity. Works of Arabic fiction emerged in English translation in the early twentieth century. Since then, this intellectual field...
Translation Modeling with Bidirectional Recurrent Neural Networks
This work presents two different translation models using recurrent neural networks. The first one is a word-based approach using word alignments. Second, we present phrase-based translation models that are more consistent with phrase-based decoding. Moreover, we introduce bidirectional recurrent neural models to the problem of machine translation, allowing us to use the full source sentence in ...
Recover Writing Trajectory from Multiple Stroked Image
The recovery of writing trajectory from offline handwritten image is generally regarded as a difficult problem [1]. This paper introduced a method to recover the writing trajectory from multiple stroked images by searching the best matching writing paths of template strokes. The searching procedure is guided by a matching cost function which is defined as the summation of positional distortion ...